Search Results: "glandium"

11 November 2014

Mike Hommey: Building a Firefox Debian package

It's actually been possible for some time, but I made that simpler recently, and I figured I should mention it.

3 October 2014

Mike Hommey: No PIE for you!

You are a software vendor. You distribute software on multiple operating systems. Let's say your software is a mildly popular internet browser. Let's say its logo represents an animal and a globe. Now, because you care about the security of your users, let's say you would like the entire address space of your application to be randomized, including the main executable portion of it. That would be neat, wouldn't it? And there's even a feature for that: Position independent executables. You get that working on (almost) all the operating systems you distribute software on. Great. Then a Gnome user (or an Ubuntu user, for that matter) comes, and tells you they downloaded your software tarball, unpacked it, and tried opening your software, but all they get is a dialog telling them:
Could not display application-name
There is no application installed for shared library files
Because, you see, a Position independent executable, in ELF terms, is actually a (position independent) shared library that happens to be executable, instead of being an executable that happens to be position independent. And nautilus (the file manager in Gnome and Ubuntu's Unity) usefully knows to distinguish between executables and shared libraries, and will happily refuse to execute shared libraries, even when they have the file-system-level executable bit set. You'd think you can get around this by using a .desktop file, but the Exec field in those files requires a full path. (No, ./ doesn't work unless the executable is in the nautilus process's current working directory, as in, the path nautilus was run from.) Dear lazyweb, please prove me wrong and tell me there's a way around this.
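To make the ELF distinction above concrete, here is a minimal Python sketch that reads the e_type field of an ELF header. It is not how nautilus decides (that goes through its own file type detection); the function name elf_type is made up for illustration. The only facts relied upon are the ELF header layout: a 16-byte e_ident, byte 5 giving the endianness, then a 16-bit e_type where 2 is ET_EXEC and 3 is ET_DYN.

```python
import struct

ET_EXEC, ET_DYN = 2, 3  # values from the ELF specification


def elf_type(header: bytes) -> str:
    """Classify an ELF file from its first 18 bytes (hypothetical helper)."""
    if header[:4] != b"\x7fELF":
        raise ValueError("not an ELF file")
    # e_ident[EI_DATA] (byte 5): 1 = little endian, 2 = big endian.
    endian = "<" if header[5] == 1 else ">"
    # e_type is the 16-bit field right after the 16-byte e_ident.
    (e_type,) = struct.unpack_from(endian + "H", header, 16)
    names = {ET_EXEC: "executable", ET_DYN: "shared object (or PIE)"}
    return names.get(e_type, "other")
```

Reading the first bytes of a binary with open(path, "rb").read(18) and passing them to elf_type would report "shared object (or PIE)" for both a real shared library and a position independent executable, which is exactly the ambiguity described above.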

25 September 2014

Mike Hommey: So, hum, bash

So, I guess you heard about the latest bash hole. What baffles me is that the following still is allowed:
env echo='() { xterm; }' bash -c "echo this is a test"
Interesting replacements for "echo", "xterm" and "echo this is a test" are left as an exercise to the reader. Update: Another thing that bugs me: Why is this feature even enabled in posix mode? (the mode you get from bash --posix, or, more importantly, when running bash as sh) After all, export -f is a bashism.

2 July 2014

Mike Hommey: Firefox and Gtk+ 3

Folks from Collabora and Red Hat have been working on making Firefox on Gtk+ 3 a thing. See Emilio's blog post for some recent update. But getting Firefox to build and run locally is unfortunately not the whole story. I've been working on getting Gtk+ 3 Firefox builds going on Mozilla build infrastructure, and I'm proud to announce today that those builds are now going through Mozilla continuous integration on a project branch: Elm, and receive the same automated testing as mozilla-central. And when I said getting Firefox to build and run was unfortunately not the whole story, I meant it: if you click on the Elm link above, you'll notice that there's a lot of orange, when it should be all green. So, yes, Firefox on Gtk+ 3 is a thing, and it now has continuous integration. But there's still a whole bunch of things to fix. So if you're interested in making those builds work better, you can hop in, there are many things you can do:

5 April 2014

Mike Hommey:

I started learning Japanese calligraphy a few months ago, with no prior experience with a brush and ink. It is an interesting endeavour. For various reasons, I had to skip class for a few weeks, but after the past ten days, I needed some stress relief on paper.

23 November 2013

Mike Hommey: Don't trust python's os.execv

Python is nice and all, but its low-level functions have really disruptive discrepancies between platforms. Case in point:
import os
os.execvp("sh", ["sh", "-c", "exit 1"])
As a UNIXy person, I'd expect running the above script to return an error code of 1. And I would be perfectly right on UNIX systems. On Windows, it returns 0. You'd think such a difference in behavior would be documented? It's not. Thank you python.
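One common way to sidestep the discrepancy (a workaround sketch, not something the post itself proposes) is to avoid the exec family when the exit status matters: run the command as a child with subprocess and hand its return code back explicitly. The helper name run_and_propagate is made up for this example.

```python
import subprocess
import sys


def run_and_propagate(argv):
    """Run argv as a child process and return its exit status.

    Unlike os.execvp, the current process stays alive, so the status can
    be forwarded explicitly on every platform.
    """
    return subprocess.call(argv)


if __name__ == "__main__":
    # Forward the exit status of a child that deliberately exits with 1.
    child = [sys.executable, "-c", "import sys; sys.exit(1)"]
    sys.exit(run_and_propagate(child))
```

The cost is an extra process staying around for the lifetime of the child, which is what exec would otherwise avoid on UNIX.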

30 May 2013

Mike Hommey:

Today, May the 30th, was my last day as a Mozilla employee. In a couple of weeks, my wife, my cat and I will be on board a flight heading about ten thousand kilometers east, and most of our stuff will be in some container on a boat. We're moving to Japan. As adventurous as this may sound, I'm not venturing into unknown territory. My wife is Japanese, and I've lived there for close to 15 months. A long time ago, arguably. I'm not actually leaving Mozilla. I'll be back as a contractor, hopefully around the 25th of June. So as far as my fellow coworkers are concerned, I'll be going on a long-ish vacation and changing timezone (but I'll probably be around in the meanwhile on irc or bugmail, with high latency). Jump-starting in a different country is not something really easy to pull off, and working for Mozilla as a remotee has been a key element in being able to do so. Although I've made it clear when I joined Mozilla that this would eventually happen, I'm thankful I can now actually do it.

27 May 2013

Mike Hommey: signal() doubly considered harmful

When you want to set signal handlers on UNIX systems, the typical choice is to use signal (specified in C89, C99 and POSIX.1-2001) or sigaction (specified in POSIX.1-2001 and System V r4). Quoting the signal manual page:
The only portable use of signal() is to set a signal's disposition to SIG_DFL or SIG_IGN. The semantics when using signal() to establish a signal handler vary across systems (and POSIX.1 explicitly permits this variation); do not use it for this purpose. POSIX.1 solved the portability mess by specifying sigaction(2), which provides explicit control of the semantics when a signal handler is invoked; use that interface instead of signal().
Then it goes on about the UNIX vs BSD semantics, and how they affect signal delivery, which essentially is the main reason why one would want to stop using signal and use sigaction instead, with specifically chosen flags. But this is not really what I wanted to talk about here. One of the uses of signal or sigaction is to temporarily set a signal handler and restore the old signal handler once the job is done. Notwithstanding the fact that it's a pretty horrible thing to do in a multi-threaded program, it's also a horrible thing to do at all with signal if sigaction is used. The core of the problem is the following: the information you get from signal() about the old signal handler is missing all the important pieces about it if it was originally set with sigaction(), namely, flags, masks and restorer. So if you do use signal() to temporarily set a signal handler and then restore the previous signal handler, you risk resetting flags, masks and restorer. The first awful thing this means is the previous signal handler might be expecting three arguments, only one of which will be valid when it's invoked. Unexpected things can also happen with the lack of expected flags or masks. This is why you'll see horrible workarounds like this or that. In short, if you do use signal() to temporarily set a signal handler and then restore the previous signal handler, you're doing it wrong. And if you do that in a system library or driver, thank you for screwing things up. I'm looking at you, libsc-a3xx.so.

14 March 2013

Mike Hommey: Google Reader death, or how the cloud model can fail you

If you're a Google Reader user, you probably read in one of your subscriptions that Google is pulling the plug on Google Reader. It is yet another demonstration of why putting data in the cloud isn't so much of a nice idea: the service you rely on may well disappear some day, with all the data it contains. Sure Google, in its extreme goodness, allows you to take out the Google Reader data. Or does it?
These are what you'll get from Google Takeout for Reader: Interestingly, while looking into shared-by-followers.json, I found urls that would correspond to friend streams. For instance, Tim Bray's is http://www.google.com/reader/public/atom/user/05198174665841271019/state/com.google/broadcast. But it's useless: all it displays is "permission denied". As for subscriptions, one of the strengths of Google Reader is that it allowed searching through past items, which means a big part of the interesting data is the archived items. But that's not part of the "take out". Sure, you have the feed urls, but most RSS feeds contain a limited amount of items, not the entire history of items for the given feed. So, history is more or less lost. Except if I star, like or share all items in all my subscriptions and take out again. So much goodness. It could have been worse, though.

19 February 2013

Mike Hommey: Ten years

Ten years ago, this very day, my first Debian package entered the Debian unstable repository. It was an addon for Mozilla Composer, Daniel Glazman's Cascades. On the same day, my second Debian package entered the Debian unstable repository as well. It was an addon for Mozilla Browser, Checky. A few days later, my third Debian package entered Debian unstable. It was an addon for Mozilla Browser, Diggler. Do you see a pattern? They are now abandoned software, although I made Checky and Diggler live a little longer (and I'm actually considering reviving Diggler) but they had their importance in my journey, and are part of the reason why I am where I am now. My journey on the web started with NCSA Mosaic on VAX/VMS, then continued with Netscape Navigator, Netscape Communicator and Mozilla Suite on Linux. That's where I was ten years ago, sailing between Galeon (a browser using the Mozilla engine) and Mozilla Suite, and filing some layout bugs. Ten years ago, there was a new kid on the block. It used to be called Phoenix, it had just changed its name to Firebird. Eventually, it changed again for Firefox. You may have heard about it. Because Firebird was so much nicer than the browser in the Mozilla Suite, I started using its Debian package, and wanted my packaged addons to work with it. So I contacted Eric Dorland, Phoenix/Firebird package maintainer at the time, and got the addons working. I then ended up fixing a bunch of packaging issues. This is how I got involved in Firefox packaging for Debian, and what eventually led me to work for Mozilla.

29 December 2012

Mike Hommey: Firefox in Debian?

Got your attention? Don't hold your breath, we're not there yet, but we're a step closer: it's now possible to build Firefox from the Iceweasel package, since version 17.0.1-2 in experimental as of writing, 18.0~b6-1 from the iceweasel-beta repository, or 19.0~a2+20121228042015-1 from the iceweasel-aurora repository. Before letting you know how you can get yourself a packaged Firefox based on the Iceweasel source, I'll remind you that redistribution of Firefox packages requires a trademark license from Mozilla, so please keep the packages you build for yourself for now. That being said, now that it's clear that such Firefox packages are not official, you can still test them for yourself. First download the Iceweasel source version of your liking, and extract it, then rename all source files from iceweasel_* to firefox_* (rename s/iceweasel/firefox/ iceweasel_* should do it). Edit debian/changelog so that the first line reads:
firefox (x.y.z-r) distribution; urgency=low
instead of:
iceweasel (x.y.z-r) distribution; urgency=low
and run the following command:
$ debian/rules debian/control
Now you're all set. You can build the package the usual way. Note there are a few differences between the xulrunner packages you get from building Iceweasel vs. from building Firefox that need to be addressed, and a few other details to sort out.

18 November 2012

Mike Hommey: Debian EFI mode boot on a Macbook Pro, without rEFIt

Diego's post got me to switch from grub-pc to grub-efi to boot Debian on my Macbook Pro. But I wanted to go further: getting rid of rEFIt. rEFIt is a pretty useful piece of software, but it's essentially dead. There is the rEFInd fork, which keeps it up-to-date, but it doesn't really help with FileVault. Moreover, the boot sequence for a Linux distro with rEFIt/rEFInd looks like: Apple EFI firmware -> rEFIt/rEFInd -> GRUB -> Linux kernel. Each intermediate step adds its own timeout, so rEFIt/rEFInd can be seen as a not-so-useful intermediate step. Thankfully, Matthew Garrett did all the research to allow booting GRUB directly from the Apple EFI firmware. Unfortunately, his blog post didn't have much actual detail on how to do it. So here it is, for a Debian system: Now, the Apple Boot Manager, shown when holding down the option key when booting the Macbook Pro, looks like this:
And the Startup disk preferences dialog under OSX, like this:

6 August 2012

Mike Hommey: Building a Linux kernel module without the exact kernel headers

Imagine you have a Linux kernel image for an Android phone, but you don't have the corresponding source, nor do you have the corresponding kernel headers. Imagine that kernel has module support (fortunately), and that you'd like to build a module for it to load. There are several good reasons why you can't just build a new kernel from source and be done with it (e.g. the resulting kernel lacks support for important hardware, like the LCD or touchscreen). With the ever-changing Linux kernel ABI, and the lack of source and headers, you'd think you're pretty much in a dead-end. As a matter of fact, if you build a kernel module against different kernel headers, the module will fail to load with errors depending on how different they are. It can complain about bad signatures, bad version or other different things. But more on that later.

Configuring a kernel

The first thing is to find a kernel source for something close enough to the kernel image you have. That's probably the trickiest part, along with getting a proper configuration. Start from the version number you can read from /proc/version. If, like me, you're targeting an Android device, try Android kernels from Code Aurora, Linaro, Cyanogen or Android, whichever is closest to what is in your phone. In my case, it was the msm-3.0 kernel. Note you don't necessarily need the exact same version. A minor version difference is still likely to work. I've been using a 3.0.21 source, while the kernel image was 3.0.8. Don't however try e.g. using a 3.1 kernel source when the kernel you have is 3.0.x. If the kernel image you have is kind enough to provide a /proc/config.gz file, you can start from there, otherwise, you can try starting from the default configuration, but you need to be extra careful, then (although I won't detail using the default configuration because I was fortunate enough that I didn't have to, there will be some details further below as to why a proper configuration is important).
Assuming arm-eabi-gcc is in your PATH, and that you have a shell opened in the kernel source directory, you need to start by configuring the kernel and install headers and scripts:
$ mkdir build
$ gunzip -c config.gz > build/.config # Or whatever you need to prepare a .config
$ make silentoldconfig prepare headers_install scripts ARCH=arm CROSS_COMPILE=arm-eabi- O=build KERNELRELEASE=`adb shell uname -r`
The silentoldconfig target is likely to ask you some questions about whether you want to enable some things. You may want to opt for the default, but that may also not work properly. You may use something different for KERNELRELEASE, but it needs to match the exact kernel version you'll be loading the module on.

A simple module

To create a dummy module, you need to create two files: a source file, and a Makefile. Place the following content in a hello.c file, in some dedicated directory:
#include <linux/module.h>       /* Needed by all modules */
#include <linux/kernel.h>       /* Needed for KERN_INFO */
#include <linux/init.h>         /* Needed for the macros */
static int __init hello_start(void)
{
  printk(KERN_INFO "Hello world\n");
  return 0;
}
static void __exit hello_end(void)
{
  printk(KERN_INFO "Goodbye world\n");
}
module_init(hello_start);
module_exit(hello_end);
Place the following content in a Makefile under the same directory:
obj-m = hello.o
Building such a module is pretty straightforward, but at this point, it won't work yet. Let me enter some details first.

The building of a module

When you normally build the above module, the kernel build system creates a hello.mod.c file, whose content can create several kinds of problems:
MODULE_INFO(vermagic, VERMAGIC_STRING);
VERMAGIC_STRING is derived from the UTS_RELEASE macro defined in include/generated/utsrelease.h, generated by the kernel build system. By default, its value is derived from the actual kernel version, and git repository status. This is what setting KERNELRELEASE when configuring the kernel above modified. If VERMAGIC_STRING doesn't match the kernel version, loading the module will lead to the following kind of message in dmesg:
hello: version magic '3.0.21-perf-ge728813-00399-gd5fa0c9' should be '3.0.8-perf'
Then, there's the module definition.
struct module __this_module
__attribute__((section(".gnu.linkonce.this_module"))) = {
 .name = KBUILD_MODNAME,
 .init = init_module,
#ifdef CONFIG_MODULE_UNLOAD
 .exit = cleanup_module,
#endif
 .arch = MODULE_ARCH_INIT,
};
In itself, this looks benign, but the struct module, defined in include/linux/module.h comes with an unpleasant surprise:
struct module
{
        (...)
#ifdef CONFIG_UNUSED_SYMBOLS
        (...)
#endif
        (...)
        /* Startup function. */
        int (*init)(void);
        (...)
#ifdef CONFIG_GENERIC_BUG
        (...)
#endif
#ifdef CONFIG_KALLSYMS
        (...)
#endif
        (...)
(... plenty more ifdefs ...)
#ifdef CONFIG_MODULE_UNLOAD
        (...)
        /* Destruction function. */
        void (*exit)(void);
        (...)
#endif
        (...)
};
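The effect of such #ifdef blocks on field placement can be illustrated outside the kernel. The Python sketch below uses ctypes with two toy structures; they are not the real struct module, and the field names other than init are made up. It only shows that a conditionally compiled field placed before init shifts where init lives.

```python
import ctypes


class ModuleWithoutOption(ctypes.Structure):
    # Toy layout standing in for a build where the config option is off.
    _fields_ = [
        ("name", ctypes.c_char * 64),
        ("init", ctypes.c_void_p),
    ]


class ModuleWithOption(ctypes.Structure):
    # Same toy layout, with one extra pointer compiled in before init.
    _fields_ = [
        ("name", ctypes.c_char * 64),
        ("unused_syms", ctypes.c_void_p),  # hypothetical conditional field
        ("init", ctypes.c_void_p),
    ]


# The offsets differ by exactly one pointer.
shift = ModuleWithOption.init.offset - ModuleWithoutOption.init.offset
print("init moved by", shift, "bytes")
```

A module built against the first layout and loaded by a kernel expecting the second would have its init pointer read from the wrong place.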
Because of these ifdefs, for the init pointer to be at the right place, CONFIG_UNUSED_SYMBOLS needs to be defined according to what the kernel image uses. And for the exit pointer, it's CONFIG_GENERIC_BUG, CONFIG_KALLSYMS, CONFIG_SMP, CONFIG_TRACEPOINTS, CONFIG_JUMP_LABEL, CONFIG_TRACING, CONFIG_EVENT_TRACING, CONFIG_FTRACE_MCOUNT_RECORD and CONFIG_MODULE_UNLOAD. Starting to understand why you're supposed to use the exact kernel headers matching your kernel? Then, the symbol version definitions:
static const struct modversion_info ____versions[]
__used
__attribute__((section("__versions"))) = {
	{ 0xsomehex, "module_layout" },
	{ 0xsomehex, "__aeabi_unwind_cpp_pr0" },
	{ 0xsomehex, "printk" },
};
These come from the Module.symvers file you get with your kernel headers. Each entry represents a symbol the module requires, and what signature it is expected to have. The first symbol, module_layout, varies depending on what struct module looks like, i.e. depending on which of the config options mentioned above are enabled. The second, __aeabi_unwind_cpp_pr0, is an ARM ABI specific function, and the last is for our printk function calls. The signature for each function symbol may vary depending on the kernel code for that function, and the compiler used to compile the kernel. This means that if you have a kernel you built from source, modules built for that kernel, and rebuild the kernel after modifying e.g. the printk function, even in a compatible way, the modules you built initially won't load with the new kernel. So, if we were to build a kernel from the hopefully close enough source code, with the hopefully close enough configuration, chances are we wouldn't get the same signatures as the binary kernel we have, and it would complain as follows, when loading our module:
hello: disagrees about version of symbol symbol_name
Which means we need a proper Module.symvers corresponding to the binary kernel, which, at the moment, we don't have.

Inspecting the kernel

Conveniently, since the kernel has to do these verifications when loading modules, it actually contains a list of the symbols it exports, and the corresponding signatures. When the kernel loads a module, it goes through all the symbols the module requires, in order to find them in its own symbol table (or another module's symbol table when the module uses symbols from other modules), and check the corresponding signature. The kernel uses the following function to search in its symbol table (in kernel/module.c):
bool each_symbol_section(bool (*fn)(const struct symsearch *arr,
                                    struct module *owner,
                                    void *data),
                         void *data)
{
        struct module *mod;
        static const struct symsearch arr[] = {
                { __start___ksymtab, __stop___ksymtab, __start___kcrctab,
                  NOT_GPL_ONLY, false },
                { __start___ksymtab_gpl, __stop___ksymtab_gpl,
                  __start___kcrctab_gpl,
                  GPL_ONLY, false },
                { __start___ksymtab_gpl_future, __stop___ksymtab_gpl_future,
                  __start___kcrctab_gpl_future,
                  WILL_BE_GPL_ONLY, false },
#ifdef CONFIG_UNUSED_SYMBOLS
                { __start___ksymtab_unused, __stop___ksymtab_unused,
                  __start___kcrctab_unused,
                  NOT_GPL_ONLY, true },
                { __start___ksymtab_unused_gpl, __stop___ksymtab_unused_gpl,
                  __start___kcrctab_unused_gpl,
                  GPL_ONLY, true },
#endif
        };
        if (each_symbol_in_section(arr, ARRAY_SIZE(arr), NULL, fn, data))
                return true;
        (...)
The struct used in this function is defined in include/linux/module.h as follows:
struct symsearch {
        const struct kernel_symbol *start, *stop;
        const unsigned long *crcs;
        enum {
                NOT_GPL_ONLY,
                GPL_ONLY,
                WILL_BE_GPL_ONLY,
        } licence;
        bool unused;
};
Note: this kernel code hasn't changed significantly in the past four years. What we have above is three (or five, when CONFIG_UNUSED_SYMBOLS is defined) entries, each of which contains the start of a symbol table, the end of that symbol table, the start of the corresponding signature table, and two flags. The data is static and constant, which means it will appear as is in the kernel binary. By scanning the kernel for three consecutive sequences of three pointers within the kernel address space followed by two integers with the values from the definitions in each_symbol_section, we can deduce the location of the symbol and signature tables, and regenerate a Module.symvers from the kernel binary. Unfortunately, most kernels these days are compressed (zImage), so a simple search is not possible. A compressed kernel is actually a small bootstrap binary followed by a compressed stream. It is possible to scan the kernel zImage to look for the compressed stream, and decompress it from there. I wrote a script to do decompression and extraction of the symbols info automatically. It should work on any recent kernel, provided it is not relocatable and you know the base address where it is loaded. It takes options for the number of bits and endianness of the architecture, but defaults to values suitable for ARM. The base address, however, always needs to be provided. It can be found, on ARM kernels, in dmesg:
$ adb shell dmesg | grep "\.init"
<5>[01-01 00:00:00.000] [0: swapper]      .init : 0xc0008000 - 0xc0037000   ( 188 kB)
The base address in the example above is 0xc0008000. If, like me, you're interested in loading the module on an Android device, then what you have as a binary kernel is probably a complete boot image. A boot image contains other things besides the kernel, so you can't use it directly with the script. Except if the kernel in that boot image is compressed, in which case the part of the script that looks for the compressed image will find it anyways. If the kernel is not compressed, you can use the unbootimg program as outlined in this old post of mine to get the kernel image out of your boot image. Once you have the kernel image, the script can be invoked as follows:
$ python extract-symvers.py -B 0xc0008000 kernel-filename > Module.symvers
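For readers curious about what such a script has to do, the two steps described earlier (finding the compressed stream, then locating the symsearch array) might look roughly like the sketch below. This is not the actual extract-symvers.py. It assumes a gzip-compressed kernel (other compressors exist), a 32-bit little-endian target, and a struct symsearch laid out as three pointers, a 4-byte enum and a one-byte bool padded to 4 bytes. The function names are made up.

```python
import struct
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"
# start, stop, crcs, licence, unused (+ 3 bytes of assumed padding)
ENTRY = struct.Struct("<IIIIB3x")


def extract_gzip_payload(blob):
    """Decompress the first valid gzip stream embedded in blob."""
    pos = blob.find(GZIP_MAGIC)
    while pos != -1:
        try:
            # 16 + MAX_WBITS makes zlib expect a gzip header; trailing
            # bytes after the stream are ignored by decompressobj.
            inflater = zlib.decompressobj(16 + zlib.MAX_WBITS)
            return inflater.decompress(blob[pos:])
        except zlib.error:
            pos = blob.find(GZIP_MAGIC, pos + 1)  # false positive, keep going
    raise ValueError("no gzip stream found")


def find_symsearch(image, base, count=3):
    """Return the offset of `count` consecutive symsearch-like entries."""
    end = base + len(image)

    def plausible(offset, licence):
        start, stop, crcs, lic, unused = ENTRY.unpack_from(image, offset)
        return (base <= start <= stop < end and base <= crcs < end
                and lic == licence and unused == 0)

    last = len(image) - count * ENTRY.size
    for offset in range(0, last + 1, 4):
        # Entries 0, 1, 2 carry NOT_GPL_ONLY, GPL_ONLY, WILL_BE_GPL_ONLY.
        if all(plausible(offset + i * ENTRY.size, i) for i in range(count)):
            return offset
    return None
```

From the offset returned by find_symsearch, the start, stop and crcs pointers of each entry (minus the base address) give the file positions of the symbol and signature tables, which is the information a Module.symvers is rebuilt from.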
Symbols and signature info could also be extracted from binary modules, but I was not interested in that information so the script doesn't handle that.

Building our module

Now that we have a proper Module.symvers for the kernel we want to load our module in, we can finally build the module: (again, assuming arm-eabi-gcc is in your PATH, and that you have a shell opened in the kernel source directory)
$ cp /path/to/Module.symvers build/
$ make M=/path/to/module/source ARCH=arm CROSS_COMPILE=arm-eabi- O=build modules
And that's it. You can now copy the resulting hello.ko onto the device, load it, and enjoy:
$ adb shell
# insmod hello.ko
# dmesg | grep insmod
<6>[mm-dd hh:mm:ss.xxx] [id: insmod]Hello world
# lsmod
hello 586 0 - Live 0xbf008000 (P)
# rmmod hello
# dmesg | grep rmmod
<6>[mm-dd hh:mm:ss.xxx] [id: rmmod]Goodbye world

20 July 2012

Mike Hommey: What is a Web App?

Is it this, this or that?

15 July 2012

Mike Hommey: Comment spam

Three weeks ago, I slightly modified the comment system on this blog for an experiment. This blog is a standard wordpress installation. Comments are normally directed to the wp-comments-post.php script by the HTML form. What I did is: During the past three weeks, on this blog, there were 7170 comments, 8 of which were actual comments. 7162 were spam (~99.9%). This means a large portion of spammers didn't care about actually checking the comment forms and used the standard wordpress url, and another large portion don't run javascript on their bots, although a very few do.

9 June 2012

Mike Hommey: Attempting to close a LinkedIn account

Following the trend, I attempted to close my LinkedIn account. Closing a LinkedIn account involves confirming and confirming and confirming again. Once it's all done, you'd expect to, well, be done with it. I'm outraged at the result: The only upside is that after I login, I can only see a page saying "Your LinkedIn account has been temporarily restricted. Contact our customer service team to get this resolved as soon as possible." Update: After contacting their customer service, the account was closed and the public profile is now unavailable.

2 June 2012

Mike Hommey: Iceweasel ESR in squeeze-backports

In case this went unnoticed, Iceweasel ESR has been available in squeeze-backports for a few weeks, now. I highly recommend anyone using Iceweasel on the Debian stable release to upgrade to that version, at the very least. Even newer versions are available through the pkg-mozilla archive.

25 March 2012

Mike Hommey: Announcing vmfs-tools version 0.2.5

The last release of vmfs-tools (0.2.1) was almost 2 years ago. It was about time to bring some of the changes that have been available in the git repository in an official tarball. So here it is. It brings some limited VMFS 5 support and experimental extent removal, as well as some deep changes to the debugvmfs tool and various fixes. Next release (0.2.6) will have a fixed fsck, which, while it still won't fix file system inconsistencies, should at least report actual inconsistencies (which is far from being true currently). I won't give any estimation as to when this will happen, though.

6 March 2012

Mike Hommey: libgcc.a symbol visibility considered harmful

I recently got to rebuild an Android NDK with a fresh toolchain again, and hit an interesting problem. I actually had hit it before, but only this time I fully analyzed what's going on. [As a side note, if you build such a NDK, don't use mpfr 3.1.0, as there is a bug in the libtool it ships] Linking an application or a library pulls in many things that aren't part of the code being built. One of these many things is the libgcc static library. Part of libgcc consists of an implementation of the platform ABI. On Android systems, this means the ARM EABI. GCC, when compiling some instructions, will generate ABI calls. For example, integer divisions may call __aeabi_idiv. Consider the following minimized real world scenario:
$ echo "int foo(int a) { return 42 % a; }" > foo.c
$ arm-linux-androideabi-gcc -o libfoo.so -shared foo.c -mandroid
GCC will emit a call to __aeabi_idivmod for the % operation. With GCC 4.6.3, this function is in _divsi3.o under libgcc.a. That function itself calls __aeabi_idiv0, which lives in _dvmd_lnx.o under libgcc.a. When statically linking, ld will thus include foo.o, _divsi3.o and _dvmd_lnx.o, meaning it will include all functions from these object files. That is, foo, __divsi3, __aeabi_idiv, __aeabi_idivmod, __aeabi_idiv0 and __aeabi_ldiv0. And more than being included, these functions are exported, because symbol visibility in libgcc.a is default. So while we expect exporting foo from our library, we're actually exporting much more, including functions that just happened to be near the ones that our code (indirectly) uses. Now, let's say we want to build another library, using that foo function from libfoo:
$ cat > bar.c <<EOF
extern int foo(int a);
long long bar(long long a) { return foo(a) % a; }
EOF
$ arm-linux-androideabi-gcc -o libbar.so -shared bar.c -mandroid
(The code above has absolutely no meaning, it just triggers the same function calls as what I was getting in the actual real world case) When statically linking the above code, GCC will generate a call to __aeabi_ldivmod, which calls __aeabi_ldiv0, and many other things, directly or indirectly. When linking as above, nothing particularly nasty is going to happen. However, linking as above is actually wrong: the resulting library has an undefined reference to the foo symbol, and doesn't depend on libfoo. At runtime, if libfoo wasn't already loaded somehow, loading libbar would fail. The proper way to link is the following:
$ arm-linux-androideabi-gcc -o libbar.so -shared bar.c -mandroid -L. -lfoo
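To check which libgcc helpers a shared library ends up exporting, as discussed for libfoo earlier, one could filter the output of GNU nm. The sketch below only parses text; it assumes the output was produced by something like `arm-linux-androideabi-nm -D --defined-only libfoo.so` (the tool prefix depends on your toolchain), and the function name is made up.

```python
def exported_aeabi_symbols(nm_output):
    """List defined __aeabi_* symbols from `nm -D` style output."""
    exported = []
    for line in nm_output.splitlines():
        fields = line.split()
        # Defined symbols have three fields: address, type letter, name.
        # T is a global text symbol, W a weak one; undefined symbols (U)
        # only have two fields and are skipped.
        if (len(fields) == 3 and fields[1] in ("T", "W")
                and fields[2].startswith("__aeabi_")):
            exported.append(fields[2])
    return exported
```

Fed with the nm output for libfoo.so, a non-empty result would confirm that platform ABI helpers leaked into the library's exported symbols, and a W entry would flag the weak ones.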
A feature of ELF static linking is that when it resolves undefined symbols, the linker will choose to use the first occurrence of a symbol it finds in the various objects and libraries given on its command line. So with the command line above, for each __aeabi_* symbol, it will first look in libfoo to see whether there is one. And while __aeabi_ldivmod is not in libfoo, __aeabi_ldiv0 is (see above). So instead of including the code for __aeabi_ldiv0 from libgcc.a, it will call the copy from libfoo. This wouldn't be so much of a problem if __aeabi_ldiv0 wasn't a weak symbol. Enter faulty.lib. In the real world case, libfoo is loaded by the system dynamic linker, and libbar by faulty.lib. When resolving symbols for libbar, faulty.lib has to resolve libfoo symbols with the system linker, using dlsym(). On Android, dlsym() returns NULL for weak (defined) symbols, so faulty.lib can't resolve __aeabi_ldiv0. The real world case wasn't a problem with GCC 4.4.3 from the vanilla Android NDK because in that GCC version, __aeabi_ldivmod doesn't call __aeabi_ldiv0. This wouldn't happen if shared libraries didn't expose random platform ABI specific bits depending on what they use and depending on other symbols that happen to be in the same object files. A similar issue happened a little while ago on Debian powerpc because a shared library was exporting ABI specific bits. Even worse, the toolchain was assuming the symbols would come from libgcc.a and generated wrong relocations for these symbols. Update: Interestingly, the __aeabi_* symbols are hidden in libgcc.a as provided on the Debian armel port.

1 March 2012

Mike Hommey: Introducing faulty.lib

TL;DR link: faulty.lib is on github. Android applications are essentially zip archives, with the APK extension instead of ZIP. While most of Android is java, and java classes are usually loaded from a ZIP archive (usually with the JAR extension), Android applications using native code need to have native libraries on the file system. These native libraries are found under /data/data/$appid/lib, where $appid is the package name, as defined in the AndroidManifest.xml file. So, when Android installs an application, it puts that APK file under /data/app. Then, if the APK contains native libraries under a lib/$ABI subdirectory (where $ABI is armeabi, armeabi-v7a or x86), it also decompresses the files and places them under /data/data/$appid/lib. This means native libraries are actually stored twice on internal flash: once compressed and once decompressed. This is why Chrome for Android takes almost 50MB of internal flash space after installation. Firefox for Android used to have that problem, and we decided we should stop doing that. Michael Wu thus implemented a custom dynamic linker, which would load most of Firefox libraries directly off the APK. This involves decompressing the zipped data somewhere in memory, and doing ld.so s job to make the library usable (please note that on Android, ld.so is actually named linker). There were initially circumstances under which we would decompress into a file and reuse it the next time Firefox starts, but we subsequently removed that possibility (except for debugging purpose) because it ended up being slower than decompressing each time (thanks to internal flash being so slow). Anyways, in order to do ld.so s job, our custom linker was directly derived from Android s system linker, with many tweaks. This custom linker has done its job quite well for some time, now, but has been recently replaced, see further below. 
Considering Firefox can't do anything useful involving Gecko until its libraries are loaded, in practice, this means Firefox can't display a web page faster than completely decompressing the libraries. Or can it?
Just don't sit down 'cause I've moved your chair
We know that a lot of code and data is not used during Firefox startup. Based on that knowledge, we started working on only loading the necessary bits. The core of the idea is, when a library is requested to be loaded, to reserve anonymous memory for its decompressed size, and that's all. That memory is protected such that any access to it triggers a segmentation fault. When a segmentation fault happens, the required bits are decompressed, and execution is resumed where it was before the segmentation fault.
The original prototype was decompressing from a normal zip deflated stream, which means it was impossible to seek in it. So, if an access was made at the end of the library, it was necessary to decompress the whole library. With some nasty binary reordering, and some difficulty, it was possible to avoid accessing the end of the library, but the approach is very fragile. It only takes an unexpected code path to make things much slower than they should be.
Consequently, over the past months, I've been working on improving the original idea and, with some assistance from Julian Seward, implemented the scheme with seekable compressed streams. Instead of letting the zip archive creation tool deflate libraries, we store specially crafted files. Essentially, files are cut in small chunks, and each chunk is compressed individually. This means less efficient compression, but it also means random access to chunks is possible.
However, instead of stacking on top of our existing custom linker, I started over, from the ground up.
First, because it needed a serious cleanup (a good part of linker.c is leftovers from the Android linker that we don't use, and APKOpen.cpp is a messy mix of JNI stubs, library decompression handling (which in itself was also a mess) and Gecko initialization code) and most importantly, because it relied on some Android system linker internals and thus required binary compatibility with the system linker, which, according to Google engineers who contacted us a few months ago, was going to break in what we now know will be called Android Jelly Bean.
The benefit of the clean slate approach is that the new code is not tied to Gecko at all and was designed to work on Android as well as on desktop Linux systems (which made debugging much, much easier). We're thus releasing the code as a separate project: faulty.lib. It is licensed under the Mozilla Public License version 2.0. Please feel free to try, contribute, and/or fork it.
This dynamic linker is not meant to completely follow standard ELF rules (most notably for symbol resolution), and as a result makes some assumptions. It's also still a work in progress, with some obvious optimizations pending (like avoiding resolving the same symbols again and again during relocations), and some features missing (for example, symbol versioning).
The next blog post will give some information about how to build Firefox for Android to benefit from on-demand decompression. I will also detail a few of the tricks involved in this dynamic linker in subsequent blog posts.
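As a rough illustration of the seekable-stream idea from this post, here is a small Python model. It is not faulty.lib's file format or code: the 16 KiB chunk size, the `compress_chunks` helper and the `LazyImage` class are assumptions made for the sketch, and an explicit `read()` call stands in for the segmentation fault that triggers decompression in the real thing.

```python
import zlib

CHUNK_SIZE = 16384  # assumed chunk size for this sketch


def compress_chunks(data, chunk_size=CHUNK_SIZE):
    """Cut data into fixed-size chunks and compress each one on its own."""
    return [zlib.compress(data[i:i + chunk_size])
            for i in range(0, len(data), chunk_size)]


class LazyImage:
    """Decompress a chunk only the first time something inside it is read."""

    def __init__(self, chunks, size, chunk_size=CHUNK_SIZE):
        self.chunks = chunks
        self.size = size
        self.chunk_size = chunk_size
        self.pages = {}  # chunk index -> decompressed bytes

    def _fault(self, index):
        # Stand-in for the fault handler: fill in the missing chunk.
        if index not in self.pages:
            self.pages[index] = zlib.decompress(self.chunks[index])
        return self.pages[index]

    def read(self, offset, length):
        end = min(offset + length, self.size)
        out = bytearray()
        while offset < end:
            index, start = divmod(offset, self.chunk_size)
            page = self._fault(index)
            take = min(end - offset, self.chunk_size - start)
            out += page[start:start + take]
            offset += take
        return bytes(out)


library = bytes(range(256)) * 1024  # 256 KiB stand-in for a library
image = LazyImage(compress_chunks(library), len(library))

tail = image.read(len(library) - 10, 10)  # access at the very end
print("chunks decompressed after reading the tail:", sorted(image.pages))
```

Reading the last ten bytes decompresses only the final chunk, where a plain deflate stream would have required decompressing everything before it. The price is that each chunk is compressed without knowledge of the others, so the total compressed size tends to be larger.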
